|
In computing, the diff utility is a data comparison tool that calculates and displays the differences between two files. Unlike edit distance notions used for other purposes, diff is line-oriented rather than character-oriented, but it is like Levenshtein distance in that it tries to determine the smallest set of deletions and insertions to create one file from the other. The diff command displays the changes made in a standard format, such that both humans and machines can understand the changes and apply them: given one file and the changes, the other file can be created. Typically, diff is used to show the changes between two versions of the same file. Modern implementations also support binary files.〔MacKenzie ''et al.'' "Binary Files and Forcing Text Comparison" in ''Comparing and Merging Files with GNU Diff and Patch''. Downloaded 28 April 2007. ()〕 The output is called a "diff", or a patch, since the output can be applied with the Unix program patch. The output of similar file comparison utilities are also called a "diff"; like the use of the word "grep" for describing the act of searching, the word ''diff'' became a generic term for calculating data difference and the results thereof.〔Eric S. Raymond (ed.), (“diff” ), ''The Jargon File'', version 4.4.7〕 == History == The diff utility was developed in the early 1970s on the Unix operating system which was emerging from Bell Labs in Murray Hill, New Jersey. The final version, first shipped with the 5th Edition of Unix in 1974, was entirely written by Douglas McIlroy. This research was published in a 1976 paper co-written with James W. Hunt who developed an initial prototype of diff. The algorithm this paper described became known as the Hunt–McIlroy algorithm. McIlroy's work was preceded and influenced by Steve Johnson's comparison program on GECOS and Mike Lesk's proof program. proof also originated on Unix and, like diff, produced line-by-line changes and even used angle-brackets (">" and "<") for presenting line insertions and deletions in the program's output. The heuristics used in these early applications were, however, deemed unreliable. The potential usefulness of a diff tool provoked McIlroy into researching and designing a more robust tool that could be used in a variety of tasks but perform well in the processing and size limitations of the PDP-11's hardware. His approach to the problem resulted from collaboration also with individuals at Bell Labs including Alfred Aho, Elliot Pinson, Jeffrey Ullman, and Harold S. Stone. In the context of Unix, the use of the ed line editor provided diff with the natural ability to create machine-usable "edit scripts". These edit scripts, when saved to a file, can, along with the original file, be reconstituted by ed into the modified file in its entirety. This greatly reduced the secondary storage necessary to maintain multiple versions of a file. McIlroy considered writing a post-processor for diff where a variety of output formats could be designed and implemented, but he found it more frugal and simpler to have diff be responsible for generating the syntax and reverse-order input accepted by the ed command. Late in 1984 Larry Wall created a separate utility, patch, releasing its source code on the ''mod.sources'' and ''net.sources'' newsgroups. This program generalized and extended the ability to modify files with output from diff. Modes in Emacs also allow for converting the format of patches and even editing patches interactively. In diff's early years, common uses included comparing changes in the source of software code and markup for technical documents, verifying program debugging output, comparing filesystem listings and analyzing computer assembly code. The output targeted for ed was motivated to provide compression for a sequence of modifications made to a file. The Source Code Control System (SCCS) and its ability to archive revisions emerged in the late 1970s as a consequence of storing edit scripts from diff. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Diff utility」の詳細全文を読む スポンサード リンク
|